Space-precedence does not apply to value-level operators #10597

kazcw · 2024-07-18T13:19:33Z

Pull Request Description

In a sequence of value-level operators, whitespace does not affect relative precedence. Functional operators still follow the space-precedence rules.

The "functional" operators are: >> << |> |>> <| <<| : ., application, and any operator containing <- or ->. All other operators are considered value-level operators.

Asymmetric whitespace can still be used to form operator sections of value-level operators, e.g. +2 * 3 is still equivalent to x -> (x+2) * 3.

Precedence of application is unchanged, so f x+y is still equivalent to f (x + y) and f x+y * z is still equivalent to (f (x + y)) * z.

Any attempt to use spacing to override value-level operator precedence will be caught by the new enso linter. Mixed spacing (for clarity) in value-operator expressions is allowed, as long as it is consistent with the precedences of the operators.

Closes #10366.

Important Notes

Precedence warnings:

The parser emits a warning if the whitespace in an expression is inconsistent with its effective precedence.
A new enso linter can be run with ./run libraries lint. It parses all .enso files in distribution/lib and test, and reports any errors or warnings. It can also be run on individual files: cargo run --release --bin check_syntax -- file1 file2... (the result may be easier to read than the ./run output).
The linter is also run as part of ./run lint, so it is checked in CI.

Additional language change:

The exponentiation operator (^) now has higher precedence than the multiplication class (*, /, %). This change did not affect any current enso files.

Library changes:

The libraries have been updated. The new warnings were used to identify all affected code; the changes themselves have not been programmatically verified (in many cases their equivalence relies on the commutativity of string concatenation).

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

The documentation has been updated, if necessary.
Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
All code follows the
Scala,
Java,
TypeScript,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.
Unit tests have been written where possible.

kazcw · 2024-07-18T13:30:52Z

lib/rust/parser/debug/tests/parse.rs

@@ -841,13 +851,13 @@ fn operator_sections() {
 #[test]
 fn template_functions() {
    #[rustfmt::skip]
-    test("_.map (_+2 * 3) _*7", block![
+    test("_.map (_ + 2*3) _*7", block![


The semantics of this test have changed, because the original syntax can no longer be expressed with template functions.

JaroslavTulach

I'd rather see linting integrated into enso CLI interface
I don't see any way warnings are propagated to CLI however
when at the spacing issues, don't you want to fix Expressions not being run if indented by 1 more space than current indentation level #9419?

JaroslavTulach · 2024-07-19T02:17:07Z

...tegration-tests/src/test/scala/org/enso/interpreter/test/semantic/SimpleArithmeticTest.scala

@@ -44,7 +44,7 @@ class SimpleArithmeticTest extends InterpreterTest {
      val code =
        """import Standard.Base.Data.Numbers
          |main =
-          |    expr = (10 * 3+2) - 1
+          |    expr = 10 * (3+2) - 1


Only a single fix in runtime-integration-tests? Good.

JaroslavTulach · 2024-07-19T02:18:36Z

test/Table_Tests/src/IO/Excel_Spec.enso

@@ -640,15 +640,15 @@ spec_write suite_builder suffix test_sheet_name =
            parent = enso_project.data / "transient" / "nonexistent"
            parent.exists.should_be_false

-            f = parent / "foo."+suffix
+            f = parent / ("foo."+suffix)


This one is unfortunate. It has nothing to do with arithmetic and looks really bad surrounded by ( and ).

Eventually we will have string interpolation, right? The parser supports it.

Yes please - lets agree how we want to do that :)

Eventually we will have string interpolation, right? The parser supports it.

Interesting. I didn't know we are supposed to support string interpolation. Where's the spec? What's the syntax? What Tree will one get?

Both main x = 'x=$x' and main x = "x=$x" resulted in a simple text literal on the IR level...

https://github.com/enso-org/design/blob/wip/wd/enso-spec/epics/enso-spec-1.0/04.%20Expressions.md#interpolated-text-literals
The elements returned by getElements can return a TextElement.Splice, but we flatten all of the elements into a string in TreeToIr to fit the current IR representation.
We also flatten in the GUI. When the backend supports splices, in the frontend we would probably want to display the spliced expression with widgets, but we would need to be careful that it is clear that it is part of the string expression.

JaroslavTulach · 2024-07-19T02:20:57Z

app/gui2/src/util/ast/__tests__/__snapshots__/raw.test.ts.snap

@@ -50,6 +50,7 @@ exports[`Parsing '2
                "spanLeftOffsetCodeStartLine": 1,
                "spanLeftOffsetCodeStartUtf8": 4,
                "type": "Number",
+                "warnings": [],


Do I read this correctly that warnings are propagated to TypeScript. How are the warnings propagated to CLI? I don't see any changes in Tree. How will the CLI report these warnings?

There is Tree.getWarnings, however they are serialized as warning IDs (plus warning-specific details), with a separate table providing the string/template to display each warning ID. I have not yet implemented the JNI to turn this in to a getMessage for the Warning object.

In order to expose the linter via enso-engine*/bin/enso we will need a way to access the warnings from JVM.

Yes, expect a PR soon. I just wanted to get this part merged since it touches a lot of files.

JaroslavTulach · 2024-07-19T02:25:01Z

build/build/src/rust/enso_linter.rs

+const ENSO_EXTENSION: &str = ".enso";
+
+pub fn cargo_run_linter_cmd(repo_root: &Path) -> Result<Command> {
+    let mut ret = Cargo.cmd()?;


While having such a linter may satisfy our current development needs, I believe the right linter shall be integrated into the enso command.

There is enso --compile that is being invoked as part of CI already. We don't need a new CI task (in my opinion), but rather we should make sure that enso --compile (maybe with --lint) fails the CI when the formatting isn't correct.

It's important for the compiler to be able to emit syntax warnings. I think it's also useful to have a simple standalone linter--when linting fails in CI, it's nice if it happens right away, not after building the engine. It isn't a new CI task; it's part of ./run lint.

it's also useful to have a simple standalone linter

Indeed. However such a linter has to be available to Enso users, not just to us. E.g. it cannot be part of ./run lint. It has to be part of enso-engine*/bin/enso.

How will we expose the linter to Enso end users? What needs to be changed to make that happen? Can you report an issue describing what needs to be done? Can you implement it (and remove current ./run lint temporary solution) or shall JVM guys in engine team do it?

The standalone linter is only for us, to support our CI process. It can provide results before the engine is finished building. For that purpose, it cannot be part of enso-engine*/bin/enso. End users will receive warnings from enso-engine*/bin/enso.

kazcw · 2024-07-19T12:27:42Z

I'd rather see linting integrated into enso CLI interface

We'll do that too. The separate linter came for free (I needed to write it to test the warnings and update the libs), and was needed for CI.

* when at the spacing issues, don't you want to fix [Expressions not being run if indented by 1 more space than current indentation level #9419](https://github.com/enso-org/enso/issues/9419)?

Line indentation and spacing between subexpressions are handled in different parts of the parser (block-building happens long before precedence-resolution).

jdunkerley

Approving the libs changes.

farmaazon · 2024-07-23T09:16:25Z

lib/rust/parser/debug/src/bin/bench_parse.rs

+fn main() {
+    let args = std::env::args().skip(1);
+    let parser = enso_parser::Parser::new();
+    let parse_time: std::time::Duration = args
+        .map(|path| {
+            let code = std::fs::read_to_string(path).unwrap();
+            let mut code = code.as_str();
+            if let Some((_meta, code_)) = enso_parser::metadata::parse(code) {
+                code = code_;
+            }
+            let start = std::time::Instant::now();
+            std::hint::black_box(parser.run(code));
+            start.elapsed()
+        })
+        .sum();
+    println!("Parse time: {} ms", parse_time.as_millis());
+}


Technically, we could use standard rust benchmark tests, as we still use nightly rust. But it may be done later, of course.

lib/rust/parser/src/lexer.rs

farmaazon · 2024-07-23T09:27:00Z

lib/rust/parser/src/syntax/tree.rs

+#[derive(Clone, Debug, Eq, PartialEq, Serialize, Reflect, Deserialize)]
+#[allow(missing_copy_implementations)] // Future warnings may have attached information.
+pub struct Warning {
+    id: u32,


Why we don't use just WarningId here (And make WarningId #[repr(u32)])?

This representation is designed for FFI. If we used the parser in Rust I would use an enum with parameters, with serialization overridden to produce this representation. Since the Rust API is only used by debug tools, I cut out the Rust-friendly type.

farmaazon · 2024-07-23T09:30:45Z

lib/rust/parser/src/syntax/tree.rs

+/// Template strings for printing warnings.
+// These must be defined in the same order as the [`WarningId`] variants.
+pub const WARNINGS: [&str; WarningId::NUM_WARNINGS as usize] =
+    ["Spacing is inconsistent with operator precedence"];


How about a function with a single match? It would not require NUM_WARNINGS, and is easier to read (no longer need to care about proper order), and - if written properly - will optimize to array of &str anyway.

The idea is to send the table across the FFI boundary, and then stringify warnings in the GC'd languages. This avoids FFI calls, encoding conversion, and allocations when emitting warnings. In the future we might move the warning table to a JSON file (which could be passed to some codegen) to further optimize this process and also make the messages easier to edit (or even localize).

farmaazon · 2024-07-23T09:32:55Z

lib/rust/prelude/src/vec.rs

+pub struct ColdVec<T> {
+    #[allow(clippy::box_collection)]
+    #[reflect(as = "Vec<T>")]
+    elements: Option<Box<Vec<T>>>,


That is how this struct fulfills its purpose. It has a shallow size of 1 pointer, and when unused that is its total size. A Vec has a shallow size of 3 pointers, even if unused.

In a sequence of value-level operators, whitespace does not affect relative precedence. Functional operators still follow the space-precedence rules. The "functional" operators are: `>> << |> |>> <| <<| : .`, application, and any operator containing `<-` or `->`. All other operators are considered value-level operators. Asymmetric whitespace can still be used to form *operator sections* of value-level operators, e.g. `+2 * 3` is still equivalent to `x -> (x+2) * 3`. Precedence of application is unchanged, so `f x+y` is still equivalent to `f (x + y)` and `f x+y * z` is still equivalent to `(f (x + y)) * z`. Any attempt to use spacing to override value-level operator precedence will be caught by the new enso linter. Mixed spacing (for clarity) in value-operator expressions is allowed, as long as it is consistent with the precedences of the operators. Closes #10366. # Important Notes Precedence warnings: - The parser emits a warning if the whitespace in an expression is inconsistent with its effective precedence. - A new enso linter can be run with `./run libraries lint`. It parses all `.enso` files in `distribution/lib` and `test`, and reports any errors or warnings. It can also be run on individual files: `cargo run --release --bin check_syntax -- file1 file2...` (the result may be easier to read than the `./run` output). - The linter is also run as part of `./run lint`, so it is checked in CI. Additional language change: - The exponentiation operator (`^`) now has higher precedence than the multiplication class (`*`, `/`, `%`). This change did not affect any current enso files. Library changes: - The libraries have been updated. The new warnings were used to identify all affected code; the changes themselves have not been programmatically verified (in many cases their equivalence relies on the commutativity of string concatenation). (cherry picked from commit e5b85bf)

kazcw commented Jul 18, 2024

View reviewed changes

kazcw added the CI: Clean build required CI runners will be cleaned before and after this PR is built. label Jul 18, 2024

kazcw force-pushed the wip/kw/math-precedence branch 3 times, most recently from 06a0360 to 35f3e1a Compare July 18, 2024 16:41

kazcw self-assigned this Jul 18, 2024

kazcw force-pushed the wip/kw/math-precedence branch 6 times, most recently from fd48f03 to 491518a Compare July 18, 2024 18:12

Space-precedence applies to functional operators, but not value-level

37790bf

kazcw force-pushed the wip/kw/math-precedence branch from 491518a to 37790bf Compare July 18, 2024 19:46

kazcw marked this pull request as ready for review July 18, 2024 19:47

kazcw requested review from jdunkerley, radeusgd, GregoryTravis, AdRiley, marthasharkey, 4e6, JaroslavTulach, hubertp, Akirathan, Frizi, farmaazon and vitvakatu as code owners July 18, 2024 19:47

JaroslavTulach reviewed Jul 19, 2024

View reviewed changes

enso-bot bot mentioned this pull request Jul 20, 2024

StackOverflow when multiple Managed Resources are being cleaned up at the same time #10211

Closed

jdunkerley approved these changes Jul 22, 2024

View reviewed changes

JaroslavTulach self-requested a review July 22, 2024 15:55

JaroslavTulach approved these changes Jul 22, 2024

View reviewed changes

AdRiley mentioned this pull request Jul 23, 2024

July 2024 Release Requirements #10638

Closed

farmaazon approved these changes Jul 23, 2024

View reviewed changes

Apply @farmaazon review

134ce83

kazcw added the CI: Ready to merge This PR is eligible for automatic merge label Jul 23, 2024

kazcw mentioned this pull request Jul 24, 2024

Provide syntax warnings to Java #10645

Merged

4 tasks

jdunkerley added the CI: Keep up to date Automatically update this PR to the latest develop. label Jul 24, 2024

Merge branch 'develop' into wip/kw/math-precedence

abffb74

jdunkerley removed the CI: Keep up to date Automatically update this PR to the latest develop. label Jul 24, 2024

mergify bot merged commit e5b85bf into develop Jul 24, 2024
42 checks passed

mergify bot deleted the wip/kw/math-precedence branch July 24, 2024 10:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Space-precedence does not apply to value-level operators #10597

Space-precedence does not apply to value-level operators #10597

kazcw commented Jul 18, 2024 •

edited

Loading

kazcw Jul 18, 2024

JaroslavTulach left a comment

JaroslavTulach Jul 19, 2024

JaroslavTulach Jul 19, 2024

kazcw Jul 19, 2024

jdunkerley Jul 22, 2024

JaroslavTulach Jul 22, 2024

kazcw Jul 23, 2024

JaroslavTulach Jul 19, 2024

kazcw Jul 19, 2024

JaroslavTulach Jul 22, 2024

kazcw Jul 23, 2024

JaroslavTulach Jul 19, 2024

kazcw Jul 19, 2024 •

edited

Loading

JaroslavTulach Jul 22, 2024

kazcw Jul 23, 2024

kazcw commented Jul 19, 2024 •

edited

Loading

jdunkerley left a comment

farmaazon Jul 23, 2024

farmaazon Jul 23, 2024

kazcw Jul 23, 2024

farmaazon Jul 23, 2024

kazcw Jul 23, 2024

farmaazon Jul 23, 2024

kazcw Jul 23, 2024

Space-precedence does not apply to value-level operators #10597

Space-precedence does not apply to value-level operators #10597

Conversation

kazcw commented Jul 18, 2024 • edited Loading

Pull Request Description

Important Notes

Checklist

Choose a reason for hiding this comment

JaroslavTulach left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kazcw Jul 19, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kazcw commented Jul 19, 2024 • edited Loading

jdunkerley left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kazcw commented Jul 18, 2024 •

edited

Loading

kazcw Jul 19, 2024 •

edited

Loading

kazcw commented Jul 19, 2024 •

edited

Loading